AITopics

2511.07778

Genre: Research Report > New Finding (0.66)

Industry:

Transportation > Ground > Road (0.67)
Automobiles & Trucks (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)

Neural Information Processing Systems - Aug-19-2025, 17:13:37 GMT

ed3c686f9cda57e56cc859402c775414-Supplemental-Conference.pdf

artificial intelligence, communication, machine learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin (0.04)

Genre: Workflow (0.47)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.69)

arXiv.org Artificial Intelligence - Jun-16-2025

DAM: Dynamic Attention Mask for Long-Context Large Language Model Inference Acceleration

Zhang, Hanzhi, Fan, Heng, Sha, Kewei, Huang, Yan, Feng, Yunhe

Long-context understanding is crucial for many NLP applications, yet transformers struggle with efficiency due to the quadratic complexity of self-attention. Sparse attention methods alleviate this cost but often impose static, predefined masks, failing to capture heterogeneous attention patterns. This results in suboptimal token interactions, limiting adaptability and retrieval accuracy in long-sequence tasks. This work introduces a dynamic sparse attention mechanism that assigns adaptive masks at the attention-map level, preserving heterogeneous patterns across layers and heads. Unlike existing approaches, our method eliminates the need for fine-tuning and predefined mask structures while maintaining computational efficiency. By learning context-aware attention structures, it achieves high alignment with full-attention models, ensuring minimal performance degradation while reducing memory and compute overhead. This approach provides a scalable alternative to full attention, enabling the practical deployment of large-scale Large Language Models (LLMs) without sacrificing retrieval performance. DAM is available at: https://github.com/HanzhiZhang-Ulrica/DAM.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

2506.11104

Country:

North America > United States > Texas (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

arXiv.org Artificial Intelligence - Feb-5-2025

Lost in Edits? A $\lambda$-Compass for AIGC Provenance

You, Wenhao, Hooi, Bryan, Wang, Yiwei, Choo, Euijin, Yang, Ming-Hsuan, Yuan, Junsong, Huang, Zi, Cai, Yujun

Recent advancements in diffusion models have driven the growth of text-guided image editing tools, enabling precise and iterative modifications of synthesized content. However, as these tools become increasingly accessible, they also introduce significant risks of misuse, emphasizing the critical need for robust attribution methods to ensure content authenticity and traceability. Despite the creative potential of such tools, they pose significant challenges for attribution, particularly in adversarial settings where edits can be layered to obscure an image's origins. We propose LambdaTracer, a novel latent-space attribution method that robustly identifies and differentiates authentic outputs from manipulated ones without requiring any modifications to generative or editing pipelines. By adaptively calibrating reconstruction losses, LambdaTracer remains effective across diverse iterative editing processes, whether automated through text-guided editing tools such as InstructPix2Pix and ControlNet or performed manually with editing software such as Adobe Photoshop. Extensive experiments reveal that our method consistently outperforms baseline approaches in distinguishing maliciously edited images, providing a practical solution to safeguard ownership, creativity, and credibility in the open, fast-evolving AI ecosystems.

artificial intelligence, machine learning, natural language, (18 more...)

2502.04364

Country:

North America > Canada > Alberta (0.14)
Oceania > Australia > Queensland (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Law (0.93)
Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Huang, Chunyang, Zhang, Shaoliang

Explainable artificial intelligence model for identifying Market Value in Professional Soccer Players

arXiv.org Artificial Intelligence - Nov-23-2023

This study introduces an advanced machine learning method for predicting soccer players' market values, combining ensemble models and the Shapley Additive Explanations (SHAP) for interpretability. Utilizing data from about 12,000 players from Sofifa, the Boruta algorithm streamlined feature selection. The Gradient Boosting Decision Tree (GBDT) model excelled in predictive accuracy, with an R-squared of 0.901 and a Root Mean Squared Error (RMSE) of 3,221,632.175. Player attributes in skills, fitness, and cognitive areas significantly influenced market value. These insights aid sports industry stakeholders in player valuation. However, the study has limitations, like underestimating superstar players' values and needing larger datasets. Future research directions include enhancing the model's applicability and exploring value prediction in various contexts.

arxiv template, market value, prediction, (15 more...)

2311.04599

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > China > Beijing > Beijing (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Sports > Soccer (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.91)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)

#artificialintelligence - Apr-24-2022, 07:20:17 GMT

Feature Transformation

Originally published on Towards AI. The life cycle of the Machine Learning model can be broken down into the following steps.

dataset, feature transformation, transformation, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Chowdhury, Shovan, Lin, Yuxiao, Liaw, Boryann, Kerby, Leslie

Evaluation of Tree Based Regression over Multiple Linear Regression for Non-normally Distributed Data in Battery Performance

arXiv.org Artificial Intelligence - Nov-3-2021

Battery performance datasets are typically non-normal and multicollinear. Extrapolating such datasets for model predictions needs attention to such characteristics. This study explores the impact of data normality in building machine learning models. In this work, tree-based regression models and multiple linear regression models are each built from a highly skewed non-normal dataset with multicollinearity and compared. Several techniques, such as data transformation, are necessary to achieve a good multiple linear regression model with this dataset; the most useful techniques are discussed. With these techniques, the best multiple linear regression model achieved an R^2 = 81.23% and exhibited no multicollinearity effect for the dataset used in this study. Tree-based models perform better on this dataset, as they are non-parametric, capable of handling complex relationships among variables, and not affected by multicollinearity. We show that bagging, as used in Random Forests, reduces overfitting. Our best tree-based model achieved an accuracy of R^2 = 97.73%. This study explains why tree-based regressions show promise as machine learning models for non-normally distributed, multicollinear data.

dataset, regression, transformation, (16 more...)

2111.02513

Country:

North America > United States > Idaho > Bonneville County > Idaho Falls (0.04)
North America > United States > Idaho > Bannock County > Pocatello (0.04)
North America > United States > Michigan > Wayne County > Livonia (0.04)
Asia > India > NCT > New Delhi (0.04)

Genre: Research Report > New Finding (0.88)

Industry:

Energy > Energy Storage (0.84)
Government > Regional Government > North America Government > United States Government (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

#artificialintelligence - Oct-23-2021, 12:00:40 GMT

Box-Cox Transformation for Normalizing a Non-normal Variable in R - Universe of Data Science

Box-Cox transformation is a commonly used remedy when the normality assumption is not met. This comprehensive guide covers estimation techniques and the use of Box-Cox transformation in practice. In this tutorial, we will work on Box-Cox transformation in R. First, we will describe two estimation techniques for the Box-Cox transformation parameter: maximum likelihood estimation (MLE) and estimation via normality tests. Second, we will work through how to apply Box-Cox transformation in practice.
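The tutorial itself works in R. As a rough illustration of the MLE route it mentions, here is a minimal Python sketch that uses only the standard library. The function names (`box_cox`, `profile_log_likelihood`, `estimate_lambda`), the grid range, and the synthetic log-normal data are assumptions made for this example, not the tutorial's code.

```python
import math
import random


def box_cox(values, lam):
    """Apply the Box-Cox transform: (y**lam - 1) / lam, or log(y) when lam is 0."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def profile_log_likelihood(values, lam):
    """Profile log-likelihood of lam under a normal model for the transformed data."""
    n = len(values)
    transformed = box_cox(values, lam)
    mean = sum(transformed) / n
    variance = sum((t - mean) ** 2 for t in transformed) / n  # MLE variance (divide by n)
    jacobian = (lam - 1.0) * sum(math.log(v) for v in values)
    return -0.5 * n * math.log(variance) + jacobian


def estimate_lambda(values, low=-2.0, high=2.0, step=0.01):
    """Grid-search the lam that maximizes the profile log-likelihood (assumed range)."""
    steps = int(round((high - low) / step))
    grid = [low + i * step for i in range(steps + 1)]
    return max(grid, key=lambda lam: profile_log_likelihood(values, lam))


# Synthetic right-skewed data (log-normal), so the estimate should land near lam = 0.
rng = random.Random(0)
skewed = [math.exp(rng.gauss(0.0, 1.0)) for _ in range(500)]
lam_hat = estimate_lambda(skewed)
normalized = box_cox(skewed, lam_hat)
print(f"estimated lambda: {lam_hat:.2f}")
```

A grid search keeps the sketch dependency-free; in practice a numerical optimizer or a library routine would usually replace it.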

box-cox transformation, estimation, transformation parameter, (11 more...)

Country:

North America > United States > New York (0.06)
North America > United States > California > Ventura County > Thousand Oaks (0.06)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.61)

#artificialintelligence - Mar-11-2021, 00:28:31 GMT

The Ultimate Scikit-Learn Machine Learning Cheatsheet - KDnuggets

All images were created by the author unless explicitly stated otherwise. Train-test-split is an important part of testing how well a model performs: the model is trained on designated training data and tested on designated testing data. This way, the model's ability to generalize to new data can be measured. In sklearn, lists, pandas DataFrames, and NumPy arrays are all accepted as the X and y parameters. Training a standard supervised learning model takes the form of an import, the creation of an instance, and the fitting of the model.
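The split-then-fit pattern described above might look like the following minimal sketch. It assumes scikit-learn is installed; the toy data, the choice of `LinearRegression`, and the `test_size` and `random_state` values are illustrative assumptions rather than anything taken from the cheatsheet.

```python
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Toy data as plain Python lists: y = 2x + 1 with a single feature.
X = [[float(i)] for i in range(20)]
y = [2.0 * i + 1.0 for i in range(20)]

# Hold out 25% of the rows for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Import, create an instance, fit: the three steps named in the text.
model = LinearRegression()
model.fit(X_train, y_train)

# score() returns R^2 on the held-out data for regressors.
r2 = model.score(X_test, y_test)
print(f"train rows: {len(X_train)}, test rows: {len(X_test)}, test R^2: {r2:.3f}")
```

Scoring on the held-out rows rather than the training rows is what gives an estimate of how the model generalizes to new data.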

dependence plot, dimension, information, (11 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.31)

#artificialintelligence - Nov-14-2020, 14:13:02 GMT

How to detect heteroscedasticity and rectify it?

One of the important assumptions of linear regression is that there should be no heteroscedasticity of residuals. In simpler terms, this means that the variance of residuals should not increase with the fitted values of the response variable. In this post, I am going to explain why it is important to check for heteroscedasticity, how to detect it in your model, and, if it is present, how to rectify the problem, with example R code. This process is sometimes referred to as residual analysis.
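The post uses R and this excerpt does not say which test it applies. As one possible way to detect heteroscedasticity, the sketch below computes the studentized (Koenker) form of the Breusch-Pagan statistic for a single predictor in plain Python: regress the squared residuals on the predictor and take n times the resulting R^2. The test choice, the helper names, and the synthetic data are all assumptions made for this example.

```python
import random


def ols_fit(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(x, y):
    """R^2 of the simple regression of y on x."""
    intercept, slope = ols_fit(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def koenker_bp_statistic(x, y):
    """n * R^2 from regressing squared residuals on x (Koenker's Breusch-Pagan form)."""
    intercept, slope = ols_fit(x, y)
    squared_residuals = [(yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)]
    return len(x) * r_squared(x, squared_residuals)


# 5% critical value of the chi-square distribution with 1 degree of freedom.
CHI2_1_CRIT_5PCT = 3.841

rng = random.Random(0)
x = [1.0 + 9.0 * i / 199 for i in range(200)]
# Noise spread grows with x (heteroscedastic) versus constant spread (homoscedastic).
y_hetero = [2.0 + 3.0 * xi + rng.gauss(0.0, 0.5 * xi) for xi in x]
y_homo = [2.0 + 3.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

stat_hetero = koenker_bp_statistic(x, y_hetero)
stat_homo = koenker_bp_statistic(x, y_homo)
print(f"heteroscedastic data: {stat_hetero:.2f} (reject if > {CHI2_1_CRIT_5PCT})")
print(f"homoscedastic data:   {stat_homo:.2f}")
```

A statistic above the critical value suggests the residual variance depends on the predictor. A plot of residuals against fitted values is a common visual companion to a formal test like this.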

heteroscedasticity, regression model, transformation, (12 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.44)